Voivodeship-level analysis

This notebook contains the analysis by voivodeship. The main notebook for the whole analysis is Analysis.ipynb.

Import libraries and modules

We will make use of the following libraries in our analysis.

In [1]:
import pandas as pd
import json
from IPython.display import display, Markdown
import plotly.express as px

We also import our own constants and functions.

In [2]:
from own_data import candidates, candidates_colors, poland_center, poland_zoom, map_margin, parties_2019, parties_2019_colors, \
    opacity, parties_to_candidates
from utils import comma_to_dot, simplify_party

Parse the results data

We read the CSV file with the results by voivodeship in percent format. The data comes from the website of the National Electoral Commission. Poland uses a comma as the decimal separator, so we convert the values to dot-separated numbers.

In [3]:
results_voivodeships_percent_df = pd.read_csv('data/results/results_by_voivodeship_percent.csv', sep=';')
results_voivodeships_percent_df = results_voivodeships_percent_df[['Kod TERYT', 'Województwo'] + candidates]

for candidate in candidates:
    results_voivodeships_percent_df[candidate] = results_voivodeships_percent_df[candidate].map(comma_to_dot)
In [4]:
results_voivodeships_percent_df.head()
Out[4]:
Kod TERYT Województwo Robert BIEDROŃ Krzysztof BOSAK Andrzej Sebastian DUDA Szymon Franciszek HOŁOWNIA Marek JAKUBIAK Władysław Marcin KOSINIAK-KAMYSZ Mirosław Mariusz PIOTROWSKI Paweł Jan TANAJNO Rafał Kazimierz TRZASKOWSKI Waldemar Włodzimierz WITKOWSKI Stanisław Józef ŻÓŁTEK
0 20000 dolnośląskie 2.61 6.44 38.21 14.09 0.16 1.91 0.09 0.17 35.92 0.15 0.25
1 40000 kujawsko-pomorskie 2.31 5.88 39.54 15.46 0.14 2.52 0.09 0.14 33.59 0.11 0.22
2 60000 lubelskie 1.63 7.99 56.67 10.45 0.20 3.04 0.22 0.14 19.32 0.10 0.25
3 80000 lubuskie 2.20 6.07 34.19 17.87 0.14 2.02 0.09 0.16 36.94 0.12 0.19
4 100000 łódzkie 2.30 6.15 46.63 12.92 0.17 2.47 0.10 0.14 28.74 0.13 0.23
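
The `comma_to_dot` helper used above comes from `utils` and is not shown in this notebook. The following is only a minimal sketch of what such a converter might look like, assuming the raw values arrive as strings such as `'38,21'`; the real implementation may differ.

```python
def comma_to_dot(value):
    """Convert a decimal-comma string such as '38,21' to a float.

    Hypothetical stand-in for utils.comma_to_dot; the real helper may differ.
    """
    # Values that are already numeric pass through as floats.
    if isinstance(value, (int, float)):
        return float(value)
    # Swap the decimal comma for a dot before parsing.
    return float(str(value).strip().replace(',', '.'))


print(comma_to_dot('38,21'))
```

pandas can also parse decimal commas at read time through the `decimal=','` argument of `pd.read_csv`, which may make the per-column mapping unnecessary if every value parses cleanly.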

Parse the geographical data

Next, we import the borders of each voivodeship from the official data of the Head Office of Geodesy and Cartography. The website of GIS Support PL lets us download a package containing only the voivodeships. To create the maps we use the GeoJSON format. The source data comes as .shp files, so we converted it to GeoJSON using MapShaper.

In [5]:
with open('data/geojson/voivodeships.json', encoding='utf-8') as response:
    voivodeships = json.load(response)
In [23]:
voivodeships['features'][0]['properties']
Out[23]:
{'JPT_SJR_KO': 'WOJ',
 'JPT_KOD_JE': '24',
 'JPT_NAZWA_': 'śląskie',
 'JPT_ORGAN_': '',
 'JPT_JOR_ID': 0,
 'WERSJA_OD': '2017-10-10T00:00:00.000Z',
 'WERSJA_DO': '1899-11-30T00:00:00.000Z',
 'WAZNY_OD': '2012-09-26T00:00:00.000Z',
 'WAZNY_DO': '1899-11-30T00:00:00.000Z',
 'JPT_KOD__1': '',
 'JPT_NAZWA1': '',
 'JPT_ORGAN1': 'NZN',
 'JPT_WAZNA_': 'BRK',
 'ID_BUFORA_': 13890,
 'ID_BUFORA1': 0,
 'ID_TECHNIC': 1331323,
 'IIP_PRZEST': 'PL.PZGIK.200',
 'IIP_IDENTY': '98a63fe6-1e56-4d05-9c47-ab4233f8a6ff',
 'IIP_WERSJA': '2017-10-10T00:00:00+02:00',
 'JPT_KJ_IIP': 'EGIB',
 'JPT_KJ_I_1': '24',
 'JPT_KJ_I_2': '',
 'JPT_OPIS': '',
 'JPT_SPS_KO': 'UZG',
 'ID_BUFOR_1': 0,
 'JPT_ID': 1331323,
 'JPT_KJ_I_3': '',
 'Shape_Leng': 12.1369516127,
 'Shape_Area': 1.55733518838}

Integrate the two data sets

The TERYT code uniquely identifies each administrative unit. In the election results the code has four extra trailing zeros, and it lacks the leading zero when the voivodeship number has only one digit. We fix both issues so that the two data sets can be joined.

In [7]:
def fix_teryt_voivodeship(teryt):
    """Fix TERYT code to integrate the two datasets for voivodeships."""
    teryt = str(teryt)
    
    if len(teryt) == 5:
        teryt = '0' + teryt
    
    return teryt[:-4]
In [8]:
results_voivodeships_percent_df['Kod TERYT'] = \
    results_voivodeships_percent_df['Kod TERYT'].astype(str).map(fix_teryt_voivodeship)
In [9]:
results_voivodeships_percent_df.head()
Out[9]:
Kod TERYT Województwo Robert BIEDROŃ Krzysztof BOSAK Andrzej Sebastian DUDA Szymon Franciszek HOŁOWNIA Marek JAKUBIAK Władysław Marcin KOSINIAK-KAMYSZ Mirosław Mariusz PIOTROWSKI Paweł Jan TANAJNO Rafał Kazimierz TRZASKOWSKI Waldemar Włodzimierz WITKOWSKI Stanisław Józef ŻÓŁTEK
0 02 dolnośląskie 2.61 6.44 38.21 14.09 0.16 1.91 0.09 0.17 35.92 0.15 0.25
1 04 kujawsko-pomorskie 2.31 5.88 39.54 15.46 0.14 2.52 0.09 0.14 33.59 0.11 0.22
2 06 lubelskie 1.63 7.99 56.67 10.45 0.20 3.04 0.22 0.14 19.32 0.10 0.25
3 08 lubuskie 2.20 6.07 34.19 17.87 0.14 2.02 0.09 0.16 36.94 0.12 0.19
4 10 łódzkie 2.30 6.15 46.63 12.92 0.17 2.47 0.10 0.14 28.74 0.13 0.23
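
The same normalization can be written more compactly with `str.zfill`. This is an alternative sketch, not the function used in this notebook, and the name `fix_teryt_voivodeship_zfill` is made up for illustration.

```python
def fix_teryt_voivodeship_zfill(teryt):
    """Return the two-digit voivodeship code from a raw results code."""
    # Pad to six characters so single-digit voivodeships regain their
    # leading zero, then keep the first two digits (dropping the four
    # trailing zeros).
    return str(teryt).zfill(6)[:2]


for raw in (20000, 100000, 240000):
    print(raw, '->', fix_teryt_voivodeship_zfill(raw))
```

For five- and six-digit inputs this should give the same result as `fix_teryt_voivodeship`.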

This is where the key that joins our data sets is located in the voivodeships GeoJSON:

In [10]:
voivodeships['features'][0]['properties']['JPT_KOD_JE']
Out[10]:
'24'
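
Before plotting, it can be worth checking that every code in the results has a matching feature in the GeoJSON, since unmatched regions would simply stay uncoloured on the map. The sketch below uses a tiny made-up GeoJSON and code list; the helper name `unmatched_codes` is hypothetical.

```python
def unmatched_codes(geojson, codes, key='JPT_KOD_JE'):
    """Return (codes missing from the GeoJSON, GeoJSON codes missing from the data)."""
    geo_codes = {feature['properties'][key] for feature in geojson['features']}
    data_codes = set(codes)
    return sorted(data_codes - geo_codes), sorted(geo_codes - data_codes)


# Toy stand-ins for the real `voivodeships` dict and the 'Kod TERYT' column.
toy_geojson = {
    'features': [
        {'properties': {'JPT_KOD_JE': '24'}},
        {'properties': {'JPT_KOD_JE': '02'}},
    ]
}
toy_codes = ['02', '04']

print(unmatched_codes(toy_geojson, toy_codes))
```

With the real objects the call would be `unmatched_codes(voivodeships, results_voivodeships_percent_df['Kod TERYT'])`; two empty lists would mean that every voivodeship can be matched.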

Plot maps

We plot the data on maps.

In [11]:
def get_figure_results_by_voivodeship(candidate):
    """Get figure showing a map of results of the given candidate by voivodeship."""
    candidate_df = results_voivodeships_percent_df[['Kod TERYT', 'Województwo', candidate]]

    
    fig = px.choropleth_mapbox(
        candidate_df, geojson=voivodeships, color=candidate,
        locations='Kod TERYT', featureidkey="properties.JPT_KOD_JE",
        center=poland_center,
        opacity=opacity, color_continuous_scale=candidates_colors[candidate],
        hover_data={'Województwo': True, 'Kod TERYT': False},
        mapbox_style="carto-positron", zoom=poland_zoom
    )
    
    fig.update_layout(margin=map_margin)
    
    return fig
In [12]:
for candidate in candidates:
    display(Markdown(f'### Results of {candidate} by voivodeship'))
    get_figure_results_by_voivodeship(candidate).show()

Results of Robert BIEDROŃ by voivodeship

Results of Krzysztof BOSAK by voivodeship

Results of Andrzej Sebastian DUDA by voivodeship

Results of Szymon Franciszek HOŁOWNIA by voivodeship

Results of Marek JAKUBIAK by voivodeship

Results of Władysław Marcin KOSINIAK-KAMYSZ by voivodeship

Results of Mirosław Mariusz PIOTROWSKI by voivodeship

Results of Paweł Jan TANAJNO by voivodeship

Results of Rafał Kazimierz TRZASKOWSKI by voivodeship

Results of Waldemar Włodzimierz WITKOWSKI by voivodeship

Results of Stanisław Józef ŻÓŁTEK by voivodeship

Who won in each voivodeship?

Find the winner in each voivodeship

In [13]:
winners_voivodeships_df = pd.concat([
    results_voivodeships_percent_df[candidates].idxmax(axis=1).rename('Winner').to_frame(),
    results_voivodeships_percent_df[candidates].max(axis=1).rename('Result').to_frame(),
    results_voivodeships_percent_df[['Województwo', 'Kod TERYT']]
], axis=1)
In [14]:
winners_voivodeships_df.head(1)
Out[14]:
Winner Result Województwo Kod TERYT
0 Andrzej Sebastian DUDA 38.21 dolnośląskie 02
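
The winner lookup above relies on `idxmax(axis=1)` returning the column label of the row-wise maximum, while `max(axis=1)` returns the value itself. A small sketch on two rows and two candidates copied from the table shown earlier (the `toy_` names are made up for illustration):

```python
import pandas as pd

toy_df = pd.DataFrame({
    'Województwo': ['dolnośląskie', 'lubuskie'],
    'Andrzej Sebastian DUDA': [38.21, 34.19],
    'Rafał Kazimierz TRZASKOWSKI': [35.92, 36.94],
})
toy_candidates = ['Andrzej Sebastian DUDA', 'Rafał Kazimierz TRZASKOWSKI']

# Column label of the largest value in each row, i.e. the winning candidate.
winner = toy_df[toy_candidates].idxmax(axis=1)
# The largest value in each row, i.e. the winner's share.
result = toy_df[toy_candidates].max(axis=1)

toy_winners = toy_df[['Województwo']].assign(Winner=winner, Result=result)
print(toy_winners)
```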

Plot the map

In [15]:
winners_voivodeships_fig = px.choropleth_mapbox(
    winners_voivodeships_df, geojson=voivodeships, color='Winner',
    locations='Kod TERYT', featureidkey="properties.JPT_KOD_JE",
    center=poland_center,
    opacity=opacity, color_discrete_sequence=px.colors.qualitative.D3,
    hover_data={'Województwo': True, 'Kod TERYT': False, 'Result': True},
    mapbox_style="carto-positron", zoom=poland_zoom
)

winners_voivodeships_fig.update_layout(margin=map_margin)

winners_voivodeships_fig.show()

Is the proportion of voters similar to that in the 2019 parliamentary elections?

In the 2020 presidential election the results were:

  • Andrzej Duda - 43.50%
  • Rafał Trzaskowski - 30.46%

Compared with the 2019 parliamentary elections, these results are quite similar. In 2019, the candidates' parties received, respectively:

  • Prawo i Sprawiedliwość - 43.59%
  • Koalicja Obywatelska - 27.40%

We compare the results of these two elections to check whether voters' preferences in the voivodeships have changed.

Parse the results data

We begin by parsing the 2019 results for the Sejm lists from the National Electoral Commission.

In [16]:
results_voivodeships_percent_2019_df = pd.read_csv('data/results/results_by_voivodeship_percent_2019.csv', sep=';')
results_voivodeships_percent_2019_df = results_voivodeships_percent_2019_df[['Kod TERYT', 'Województwo'] + parties_2019]

for party in parties_2019:
    results_voivodeships_percent_2019_df[party] = results_voivodeships_percent_2019_df[party].map(comma_to_dot)
In [17]:
results_voivodeships_percent_2019_df.head()
Out[17]:
Kod TERYT Województwo KOALICYJNY KOMITET WYBORCZY KOALICJA OBYWATELSKA PO .N IPL ZIELONI - ZPOW-601-6/19 KOMITET WYBORCZY PRAWO I SPRAWIEDLIWOŚĆ - ZPOW-601-9/19
0 20000 dolnośląskie 30.19 38.32
1 40000 kujawsko-pomorskie 28.76 38.39
2 60000 lubelskie 17.44 57.10
3 80000 lubuskie 31.27 34.30
4 100000 łódzkie 24.32 45.87

The geographical data is already parsed.

Integrate the results data with the geographical data

Fix the TERYT code.

In [18]:
results_voivodeships_percent_2019_df['Kod TERYT'] = \
    results_voivodeships_percent_2019_df['Kod TERYT'].astype(str).map(fix_teryt_voivodeship)
In [19]:
results_voivodeships_percent_2019_df.head()
Out[19]:
Kod TERYT Województwo KOALICYJNY KOMITET WYBORCZY KOALICJA OBYWATELSKA PO .N IPL ZIELONI - ZPOW-601-6/19 KOMITET WYBORCZY PRAWO I SPRAWIEDLIWOŚĆ - ZPOW-601-9/19
0 02 dolnośląskie 30.19 38.32
1 04 kujawsko-pomorskie 28.76 38.39
2 06 lubelskie 17.44 57.10
3 08 lubuskie 31.27 34.30
4 10 łódzkie 24.32 45.87
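
The long committee names in the column headers above are shortened later by `simplify_party` from `utils`, which is not shown in this notebook. The following is a hypothetical sketch of such a helper, assuming a plain lookup table; the short names are the ones that appear in the map headings, and unknown labels such as 'Kod TERYT' pass through unchanged so the function can be applied to every column.

```python
# Assumed mapping from the official committee names to short party names.
SHORT_PARTY_NAMES = {
    'KOALICYJNY KOMITET WYBORCZY KOALICJA OBYWATELSKA PO .N IPL ZIELONI - ZPOW-601-6/19':
        'Koalicja Obywatelska',
    'KOMITET WYBORCZY PRAWO I SPRAWIEDLIWOŚĆ - ZPOW-601-9/19':
        'Prawo i Sprawiedliwość',
}


def simplify_party(name):
    """Return the short party name, or the input unchanged if it is not a known committee."""
    return SHORT_PARTY_NAMES.get(name, name)


print(simplify_party('KOMITET WYBORCZY PRAWO I SPRAWIEDLIWOŚĆ - ZPOW-601-9/19'))
print(simplify_party('Kod TERYT'))
```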

Plot two maps for each candidate

In [20]:
def get_figure_results_by_voivodeship_2019(party):
    """Get figure showing a map of results of the given party by voivodeship."""
    party_df = results_voivodeships_percent_2019_df[['Kod TERYT', 'Województwo', party]]
    party_df.columns = party_df.columns.to_series().apply(simplify_party)
    
    simplified_party = simplify_party(party)
    

    fig = px.choropleth_mapbox(
            party_df, geojson=voivodeships, color=simplified_party,
            locations='Kod TERYT', featureidkey="properties.JPT_KOD_JE",
            center=poland_center,
            opacity=opacity, color_continuous_scale=parties_2019_colors[party],
            hover_data={'Województwo': True, 'Kod TERYT': False},
            mapbox_style="carto-positron", zoom=poland_zoom
    )
    
    fig.update_layout(margin=map_margin)
    
    return fig
In [21]:
for party in parties_2019:
    simplified_party = simplify_party(party)
    
    display(Markdown(f'### Results of {simplified_party} in 2019 by voivodeship'))
    get_figure_results_by_voivodeship_2019(party).show()
    
    candidate = parties_to_candidates[simplified_party]
    
    display(Markdown(f"### Results of the {simplified_party}'s candidate in 2020 by voivodeship"))
    get_figure_results_by_voivodeship(candidate).show()

Results of Koalicja Obywatelska in 2019 by voivodeship

Results of the Koalicja Obywatelska's candidate in 2020 by voivodeship

Results of Prawo i Sprawiedliwość in 2019 by voivodeship

Results of the Prawo i Sprawiedliwość's candidate in 2020 by voivodeship

We can observe that the support for Andrzej Duda and for the Prawo i Sprawiedliwość party is almost the same. In the case of Rafał Trzaskowski and Koalicja Obywatelska there are slight differences in the western part of Poland.
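
The visual comparison can also be expressed as numbers by joining the two result sets on the TERYT code and subtracting the shares. The sketch below is limited to the five rows printed by `head()` earlier in this notebook, and the short column names are made up for readability; running the same steps on the full data frames would cover all voivodeships.

```python
import pandas as pd

# First five rows of the 2020 results, copied from the head() output.
results_2020 = pd.DataFrame({
    'Kod TERYT': ['02', '04', '06', '08', '10'],
    'Województwo': ['dolnośląskie', 'kujawsko-pomorskie', 'lubelskie', 'lubuskie', 'łódzkie'],
    'Duda 2020': [38.21, 39.54, 56.67, 34.19, 46.63],
    'Trzaskowski 2020': [35.92, 33.59, 19.32, 36.94, 28.74],
})

# First five rows of the 2019 results, copied from the head() output.
results_2019 = pd.DataFrame({
    'Kod TERYT': ['02', '04', '06', '08', '10'],
    'PiS 2019': [38.32, 38.39, 57.10, 34.30, 45.87],
    'KO 2019': [30.19, 28.76, 17.44, 31.27, 24.32],
})

# Join on the shared key and compute the change in percentage points.
comparison = results_2020.merge(results_2019, on='Kod TERYT', how='inner')
comparison['Duda - PiS'] = (comparison['Duda 2020'] - comparison['PiS 2019']).round(2)
comparison['Trzaskowski - KO'] = (comparison['Trzaskowski 2020'] - comparison['KO 2019']).round(2)

print(comparison[['Województwo', 'Duda - PiS', 'Trzaskowski - KO']])
```

On these five rows, Andrzej Duda's share stays within about 1.2 percentage points of the 2019 Prawo i Sprawiedliwość result, while Rafał Trzaskowski's share exceeds the 2019 Koalicja Obywatelska result by roughly 2 to 6 points.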